wetdog's Repositories

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support

⭐ 0 🌐 Public

aiexperiments-bird-sounds

Thousands of bird sounds visualized using machine learning.

⭐ 0 🌐 Public

AQI-Catalonia-Challenge

⭐ 0 🌐 Public

Spatial soundscape synthesis using ray-tracing

⭐ 0 🌐 Public

Data manipulation and transformation for audio signal processing, powered by PyTorch

⭐ 0 🌐 Public

audio-transformers-course

The Hugging Face Course on Transformers for Audio

⭐ 0 🌐 Public

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

⭐ 0 🌐 Public

Collection of notebooks and scripts related to audio processing and machine learning.

⭐ 0 🌐 Public

audioset_experiments

Various experiments with the AudioSet dataset and TensorFlow

⭐ 0 🌐 Public

⭐ 0 🌐 Public

audio_preprocess

Simple scripts and utils for audio dataset preparation

⭐ 0 🌐 Public

cheatsheet-translation

Translation of VIP cheatsheets for Machine Learning and Deep Learning

⭐ 0 🌐 Public

Music AI team projects

⭐ 0 🌐 Public

⭐ 0 🌐 Public

⭐ 0 🌐 Public

DCASE2017-baseline-system

DCASE 2017 Baseline system

⭐ 0 🌐 Public

DCASE_explorations

Experiments with the DCASE framework and database

⭐ 0 🌐 Public

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch

⭐ 0 🌐 Public

This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.

⭐ 0 🌐 Public

Env_soundrecognition

Environmental sound recognition of traffic events such as car, motorcycle, heavy vehicle and horn using librosa and scikit-learn

⭐ 1 🌐 Public

Sound Level Meter with ESP32 and I2S MEMS microphone

⭐ 0 🌐 Public

Functional programming language for signal processing and sound synthesis

⭐ 0 🌐 Public

gigagan-pytorch

Implementation of GigaGAN, new SOTA GAN out of Adobe. Culmination of nearly a decade of research into GANs

⭐ 0 🌐 Public

insanely-fast-whisper

⭐ 0 🌐 Public

introtodeeplearning_labs

Lab Materials for MIT 6.S191: Introduction to Deep Learning

⭐ 0 🌐 Public

An autoregressive character-level language model for making more things

⭐ 0 🌐 Public

ICS-43432 MEMS breakout circular board

⭐ 0 🌐 Public

Models and examples built with TensorFlow

⭐ 0 🌐 Public

musicinformationretrieval.com

Instructional notebooks on music information retrieval.

⭐ 0 🌐 Public

open-tts-tracker

⭐ 0 🌐 Public

orca-embeddings

Extraction pipelines and experiments with audio embeddings (Jose's GSoC work, 2021)

⭐ 0 🌐 Public

p5.sound brings the Processing approach to Web Audio and p5.js.

⭐ 0 🌐 Public

PAM is a no-reference audio quality metric for audio generation tasks

⭐ 0 🌐 Public

pflowtts_pytorch

Unofficial implementation of NVIDIA P-Flow TTS paper

⭐ 0 🌐 Public

psysound3 getting backroom surgery to work with MIRtoolbox

⭐ 0 🌐 Public

Implementation of a fractional octave filterbank for Python. Based on NumPy and CFFI.

⭐ 0 🌐 Public

Basic audio and GPS data loggers based on Raspberry Pi

⭐ 0 🌐 Public

Basic sound level meter on Raspberry Pi using the pyfilterbank library

⭐ 1 🌐 Public

Web visualization and listening page for sound datasets.

⭐ 3 🌐 Public

SliderSpace: Decomposing the Visual Capabilities of Diffusion Models

⭐ 0 🌐 Public

stanford-tensorflow-tutorials

This repository contains code examples for Stanford's course: TensorFlow for Deep Learning Research.

⭐ 0 🌐 Public

StyleTTS2_fabric

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

⭐ 0 🌐 Public

surround-soundscape

⭐ 2 🌐 Public

⭐ 0 🌐 Public

Pattern language

⭐ 0 🌐 Public

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

⭐ 0 🌐 Public

TTS-arxiv-daily

Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updated every 12 hours)

⭐ 0 🌐 Public

An exercise to fine-tune a FastPitch model using the Coqui TTS framework on a specific speaker of the ARCTIC dataset

⭐ 0 🌐 Public

tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

⭐ 0 🌐 Public

Easily create large video dataset from video urls

⭐ 0 🌐 Public

Unofficial VITS2 TTS implementation in PyTorch

⭐ 0 🌐 Public

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

⭐ 5 🌐 Public

General Speech Restoration

⭐ 0 🌐 Public

wavenext_pytorch

Unofficial implementation of the WaveNeXt vocoder

⭐ 53 🌐 Public

MUSHRA-compliant experiment software based on the Web Audio API

⭐ 0 🌐 Public

Page to customize the profile header

⭐ 0 🌐 Public

wetdog.github.io

Personal webpage forked from https://academicpages.github.io

⭐ 0 🌐 Public

Starter code for working with the YouTube-8M dataset.

⭐ 0 🌐 Public